OIL-BODY PROTEINS AS CARRIERS OF
HIGH-VALUE PEPTIDES IN PLANTS

INTRODUCTION 

Technical Field 

This invention relates to a method for the production by
recombinant means of a protein of interest which is easily
purified from host cell components. The method is exemplified
by expression of the protein of interest in plants,
particularly seeds, as a chimeric peptide comprising an
oil-body protein and the protein of interest.

Background 

A variety of proteins have been expressed in plants. However,
while the general feasibility of obtaining expression of
foreign proteins in plants has been demonstrated, obtaining
purified proteins from this source has some limitations. These
limitations include the purification steps necessary to obtain
pure protein essentially free of plant-derived materials and
the degradation that may occur in extracts prepared during the
purification procedure when the recombinant proteins obtained
are in contact with aqueous buffers.

Plants bearing oilseeds such as soybean, rapeseed, sunflower
and a number of other plant species such as corn, carrot,
etc., store triglycerides in their seeds. In the plant, these
triglycerides act as a source of energy for a germinating seed
and the subsequent seedling. The triglycerides are widely used
as vegetable oils in foods and in food preparation and also
for some industrial applications.

Triglycerides are immiscible with water and partition by
floating on the surface of aqueous solutions or by forming
small globules or liposomes as a suspension in the aqueous
phase. Such globules will naturally coalesce if they are not
stabilized by a modified surface layer. This coalescence can
result in a suspension of globules of random sizes. In seeds,
where triglyceride is stored, the oil globules are actually
encapsulated lipid or oil bodies, normally of uniform size.
Associated with the surface of these oil bodies is a half unit
membrane studded with several proteins, generally referred to
as oil-body proteins.

At least one class of oil-body proteins has some
characteristics which are highly conserved between species.
This class of oil-body proteins is referred to as an
"oleosin." The hydrophilic N- and C-termini of these proteins
appear to be quite divergent, whereas the lipophilic internal
region (central core) appears to be highly conserved between
species. The oleosins are strongly associated with the oil
bodies; this strong association with the oil bodies may, in
major part, be due to the lipophilic nature of this central
core. It is therefore of interest to determine whether
oil-body proteins such as oleosins may be useful in a method
for the production of recombinant proteins by providing a
means for separation of the recombinant proteins from
plant-derived materials.

Relevant Literature 

The production of foreign (recombinant) peptides in plants has
been investigated using a variety of approaches including
transcriptional fusions using a strong constitutive plant
promoter (e.g., from cauliflower mosaic virus, Sijmons et al.
(1990) Bio/Technology, 8:217-221) and the coding sequence of a
foreign protein; transcriptional fusions with organ specific
sequences (Radke et al. (1988) Theor. Appl. Genet.,
75:685-694); and translational fusions which require
subsequent cleavage of a recombinant protein (Vandekerckhove
et al. (1989) Bio/Technology, 7:929-932). Foreign proteins
which have been expressed in plant cells include active
proteins from bacteria (Fraley et al. (1983) Proc. Nat'l.
Acad. Sci. USA, 80:4803-4807), animals (Misra and Gedamu
(1989) Theor. Appl. Genet., 78:161-168), fungi and other plant
species (Fraley et al. (1983) Proc. Nat'l. Acad. Sci. USA,
80:4803-4807).
Some proteins, normally markers of integration, have been
expressed in a tissue-specific manner, including some in seeds
(Sengupta-Gopalan et al. (1985) Proc. Nat'l. Acad. Sci. USA,
82:3320-3324; Radke et al. (1988) Theor. Appl. Genet.,
75:685-694). These reports have concentrated specifically on
the use of seed-storage protein promoters as a means of
deriving seed-specific expression. Using such a system,
Vandekerckhove et al. (1989) Bio/Technology, 7:929-932,
expressed a high value peptide (leu-enkephalin) in seeds of
Arabidopsis thaliana and Brassica napus. The yield of this
peptide was quite low, but the work demonstrates the
feasibility of expression of an animal peptide hormone in
plant tissues. Maize oleosin has been expressed in seed oil
bodies in Brassica napus transformed with a maize oleosin
gene. The gene was expressed under the control of regulatory
elements from a Brassica gene encoding napin, a major seed
storage protein. The temporal regulation and tissue
specificity of expression was reported to be correct for a
napin gene promoter/terminator. See Lee et al., Proc. Nat'l.
Acad. Sci. (USA) (1991) 88:6181-6185.

The oil globules which are produced in seeds all appear to be
of a similar size, indicating that they are stabilized (Huang
A.H.C. (1985) in Modern Meths. Plant Analysis, Vol. 1:145-151
Springer-Verlag, Berlin). On closer inspection, it has been
found that these are not simple oil globules, but rather
oil bodies surrounded by a membrane. These oil bodies have
been variously named by electron microscopists oleosomes,
lipid bodies and spherosomes (Gurr M.I. (1980) in The
Biochemistry of Plants, 4:205-248, Acad. Press, Orlando, Fla).
The oil bodies of a few species have been studied and the
general conclusion is that they are encapsulated by an unusual
"half-unit" membrane comprising not a classical lipid bilayer,
but rather a single amphiphilic layer with hydrophobic groups
on the inside and hydrophilic groups on the outside (Huang
A.H.C. (1985) in Modern Meths. Plant Analysis, Vol. 1:145-151
Springer-Verlag, Berlin).

Analysis of the contents of lipid bodies has demonstrated that
apart from triglyceride and membranous material, there are
also several polypeptides/proteins associated with the surface
or lumen of the oil body (Bowman-Vance and Huang (1987) J.
Biol. Chem., 262:11275-11279; Murphy et al. (1989) Biochem.
J., 258:285-293; Taylor et al. (1990) Planta, 181:18-26).
Oil-body proteins have been identified in a wide range of
taxonomically diverse species (Moreau et al. (1980) Plant
Physiol., 65:1176-1180; Qu et al. (1986) Biochem. J.,
235:57-65) and have been shown to be uniquely localized in
oil bodies and not found in organelles of vegetative tissues.
In Brassica napus (rapeseed) there are at least three
polypeptides associated with the oil bodies of developing
seeds (Taylor et al. (1990) Planta, 181:18-26). The numbers
and sizes of oil-body associated proteins may vary from
species to species. In corn, for example, there are two
immunologically distinct polypeptide classes found in
oil bodies (Bowman-Vance and Huang (1988) J. Biol. Chem.,
263:1476-1481). Oleosins have been shown to comprise regions
of alternate hydrophilicity, hydrophobicity and hydrophilicity
(Bowman-Vance and Huang (1987) J. Biol. Chem.,
262:11275-11279). The amino acid sequences of oleosins from
corn, rapeseed, and carrot have been obtained. See Qu and
Huang (1990) J. Biol. Chem., 265:2238-2243, Hatzopoulos et al.
(1990) Plant Cell, 2:457-467, respectively. In an oilseed such
as rapeseed, oleosin may comprise between 8% (Taylor et al.
(1990) Planta, 181:18-26) and 20% (Murphy et al. (1989)
Biochem. J., 258:285-293) of total seed protein. Such a level
is comparable to that found for many seed storage proteins.

Genes encoding oil-body proteins have been reported for two
species, maize (Zea mays, Bowman-Vance and Huang (1987) J.
Biol. Chem., 262:11275-11279; and Qu and Huang (1990) J. Biol.
Chem., 265:2238-2243) and carrot (Hatzopoulos et al. (1990)
Plant Cell, 2:457-467).

WO 93/21320
PCT/CA92/00161

SUMMARY OF THE INVENTION 

Methods and compositions are provided for the production of
peptides which may be easily purified from host proteins. The
method includes the steps of preparing a chimeric DNA
construct which includes a sequence encoding an oil-body
specific sequence comprising the coding sequence of a
seed-specific oil-body protein gene, or a sequence encoding at
least a portion of the hydrophobic core of an oil-body
protein, and a coding sequence for a peptide of interest, from
which an expression cassette containing the chimeric DNA
construct can be prepared; transforming a host cell with the
expression cassette under genomic integration conditions; and
growing the resulting transgenic plant to produce seed in
which the polypeptide of interest is expressed as a fusion
protein with the oleosin.

The polypeptide of interest may be purified by isolating
oil bodies from the cells of the seed, and disrupting the
oil bodies so that the fusion protein is released. The
oil-body protein is then easily separated from other proteins
and plant-derived material by phase separation. Optionally a
cleavage site may be located at least one of prior to the
N-terminus and after the C-terminus of the polypeptide of
interest, allowing the fusion polypeptide to be cleaved and
separated by phase separation into its component peptides. The
production system thus provides for targeting of the chimeric
peptide by its oil-body protein functionality to the oil
bodies which, in turn, permits rapid purification of the
polypeptide of interest. This production system finds utility
in the production of many peptides such as those with
pharmaceutical, enzymic, rheological and adhesive properties.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1A shows the nucleotide sequence and deduced amino-acid
sequence (17 kDa protein) of an oil-body protein gene
(oleosin) from Arabidopsis thaliana. Underlined are the direct
repeats (R1 and R2) and inverted repeat (T), and the CACA,
TATA, TAAT and polyadenylation signals. The intron sequence is
printed in lower case and a putative ABA-binding site is
indicated in bold.

Fig. 1B shows a comparison of the sequences of an oil-body
16 kDa protein from carrot, an 18 kDa and a 16 kDa oil-body
protein from maize and a 17 kDa oil-body protein from
Arabidopsis thaliana, indicating conserved and divergent
regions of the proteins; the amino acid sequences are aligned
to show the conservation of sequence in the central region of
the proteins.

Fig. 2 shows constructs used for the fusion of oil-body
protein genes with genes encoding foreign peptides. IA is a
C-terminal fusion of a desired peptide to OBP; IB is an
N-terminal fusion of a desired peptide to OBP; II is an
internal fusion of a desired peptide within OBP; and III is an
inter-dimer translational fusion of desired peptide enclosed
between two substantially complete oil body protein targeting
sequences. In the upper portion of the figure (A) are shown
the DNA constructs used for translational fusions of desired
peptides to oil-body proteins. In the lower portion of the
figure (B) are shown the configurations of the gene products
of the constructs in the upper portion, following translation
and delivery to the oil bodies. The key to the figure is as
follows: bottom left-top right hatched box represents an OBP
promoter or other seed specific promoter; bottom right-top
left hatched box represents a desired peptide coding sequence;
open box represents an oil-body protein coding sequence or
synthetic targeting sequence based on OBP conserved motifs;
vertical-horizontal hatched box represents a gene terminator
containing a polyadenylation signal; hatched circle represents
a protease recognition motif; corkscrew line represents a
native C- or N-terminal of OBP.

Fig. 3 shows a detailed arrangement for construction of a
C-terminal fusion. Shown in the arrangement is a collagenase
recognition motif coding sequence as a linker in the fusion of
a typical oil-body protein gene and a fusion peptide, to be
linked here using an NcoI site, for cloning and expression in
plants.

Fig. 4 shows schematically the process of construction of
fusion peptide vectors, their introduction into plants and
subsequent extraction and assay of the desired recombinant
peptide.

Figure 5 shows a schematic representation of the construction
of pCGOBPILT. The broken line box represents an oleosin
promoter; the top left-bottom right hatched box represents an
oleosin coding sequence; the horizontal-vertical hatched box
represents an intron; the dotted box represents a 3'
non-translated sequence; and the widely-spaced top left-bottom
right hatched box represents an interleukin-1-β sequence
equipped with a sequence encoding a protease cleavage site
(Factor Xa or thrombin immediately upstream).

Figure 6 shows the design of oligonucleotide GVR11. Figure 6A
represents the 3' coding sequence of the A. thaliana oleosin,
translationally fused to the factor Xa/IL-1-β coding sequence
followed by a TAA stop codon. For future cloning purposes, a
PvuI and a SalI restriction enzyme recognition site are
included. The creation of a PvuI restriction site resulted in
the additional coding sequence for an alanine (ala).
Underlined are the restriction enzyme recognition sequences.
Overlined are the A. thaliana oleosin sequences and the factor
Xa recognition sequence. The actual cleavage site is indicated
with an asterisk (*). In Figure 6B, the sequence of GVR11 is
shown. In order to make a fusion with the A. thaliana oleosin,
the primer GVR11 needs to be a sequence complementary to the
top strand.

Figure 7 shows the nucleotide sequence of OBPILT. Underlined
is the sequence encoding IL-1-β; the sequence encoding the
factor Xa recognition site is indicated in bold. The nopaline
synthase terminator sequence is indicated in lower case
letters.



DESCRIPTION OF THE SPECIFIC EMBODIMENTS

In accordance with the subject invention, methods and
compositions are provided for production of peptides which are
easily purified. The subject method includes the steps of
preparing an expression cassette containing DNA sequences
encoding a sufficient portion of an oil body specific
sequence, such as an oleosin, to provide for targeting to an
oil body, and the peptide of interest; transforming the
expression cassette into a plant cell host; generating a
transgenic plant and growing it to produce seed in which the
chimeric protein is expressed and translocated to the oil
bodies. The chimeric peptide comprises the peptide of interest
and an oil body protein such as an oleosin. The peptide of
interest generally is a foreign peptide normally not expressed
in seeds or found on the oil body. The use of an oil-body
protein as a carrier or targeting means provides a simple
mechanism to obtain purification of the foreign protein. The
chimeric protein is separated away from the bulk of cellular
protein in a single step (such as centrifugation or
flotation); the protein is also protected from degradation
during extraction as the separation also removes non-specific
proteases from contact with the oil bodies. The gene encoding
the foreign peptide may be derived from any source, including
plant, bacterial, fungal or animal source. Desirably, the
chimeric peptide will contain sequences which allow for
cleavage of the peptide of interest from the oleosin. The
method may be employed to express a variety of peptides which
are then easily purified.

Targeting a foreign, recombinant protein to the oil body
imparts several advantages, including the following. The
protein can be separated from the bulk of cellular contents
after cell lysis by centrifugation. The oil-body fraction will
float on the surface of the extract. The protein can
optionally be provided with a peptide linker containing a
protease recognition site. This permits release of the peptide
from the oil body. The protein can be introduced into a
recombinant polypeptide in such a way that it is within a
lipophilic conserved region. This results in the
internalization of the recombinant peptide into the oil body,
thus protecting it from protease attack.

The expression cassette generally will include, in the 5'-3'
direction of transcription, a transcriptional and
translational regulatory region capable of expression in
developing seed, typified by the promoter and upstream regions
associated with an oil body protein, which will provide for
expression of the chimeric protein in seed, a DNA sequence
encoding a chimeric peptide comprising an amino acid sequence
to provide an oil body targeting means and a protein of
interest, and a transcriptional and translational termination
region functional in plants. One or more introns may also be
present.
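The 5'-3' ordering of the cassette described above can be sketched as a small data structure. This is only an illustration: the role names, the `Element` class, the `assemble_cassette` helper and the short placeholder sequences are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    role: str      # hypothetical role label for a cassette component
    sequence: str  # DNA written 5'->3' (placeholder sequences only)


# Order required by the description, read 5' -> 3'.
REQUIRED_ORDER = ("regulatory_region", "chimeric_coding", "termination_region")


def assemble_cassette(elements):
    """Join elements 5'->3' after checking the required order of roles.

    Elements labelled "intron" are optional and ignored by the order check.
    """
    roles = tuple(e.role for e in elements if e.role != "intron")
    if roles != REQUIRED_ORDER:
        raise ValueError(f"unexpected element order: {roles}")
    return "".join(e.sequence for e in elements)


# Placeholder sequences, chosen only so the example runs.
cassette = assemble_cassette([
    Element("regulatory_region", "TATAAATA"),
    Element("chimeric_coding", "ATGGCTTAA"),
    Element("termination_region", "AATAAA"),
])
```

A real construct would of course carry full-length regulatory and coding sequences; the check only captures the ordering stated in the text.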

The oil-body specific sequence finds analogy in fragments of
oil-body proteins, particularly oleosins. The oil-body
specific sequence may be the same as that of a sequence
obtainable from an oil-body protein, or one which has
sufficient homology to provide for the desired targeting of a
protein of interest to an oil body. By "obtainable" is
intended an amino acid sequence which may be natural,
synthetic or a combination, sufficiently similar to a native
oil body protein amino acid sequence to provide the desired
targeting. Of particular interest is the central hydrophobic
domain of oil body proteins which appears to be highly
conserved among different plant species, and fragments thereof
and homologous sequences at the amino acid level.

The deduced amino acid sequence for an Arabidopsis thaliana
oil-body protein is as follows:

                 10                  20
M-N-G-R-D-R-D-Q-Y-Q-M-S-G-R-G-S-D-Y-S-K-
                 30                  40
S-R-Q-I-A-K-A-A-T-A-V-T-A-G-G-S-L-L-V-L-
                 50                  60
L-S-L-T-L-V-G-T-V-I-A-L-T-V-A-T-P-L-L-V-
                 70                  80
I-S-S-T-I-L-V-P-A-L-I-T-V-A-L-L-I-T-G-S-
                 90                 100
L-S-S-G-G-F-G-I-A-A-I-T-V-F-S-W-I-Y-K-Y-
                110                 120
L-L-I-E-H-P-Q-G-S-D-K-L-D-S-A-R-M-K-L-G-
                130                 140
S-K-A-Q-D-L-K-D-R-A-Q-Y-Y-G-Q-Q-H-T-G-W-
                150
E-H-D-R-D-R-T-R-G-G-Q-H-T-T

Amino acids from about 25-101 comprise the central hydrophobic
domain.
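As a rough illustration of the statement that residues of about 25-101 form the central hydrophobic domain, the sketch below slices that region out of the sequence listed above and compares its content of charged residues with that of the whole protein. The sequence is transcribed from the listing as printed; the helper names and the choice of D, E, K and R as "charged" residues are assumptions made for this example only.

```python
# Sequence transcribed from the listing above, 20 residues per row.
OLEOSIN = (
    "MNGRDRDQYQMSGRGSDYSK"
    "SRQIAKAATAVTAGGSLLVL"
    "LSLTLVGTVIALTVATPLLV"
    "ISSTILVPALITVALLITGS"
    "LSSGGFGIAAITVFSWIYKY"
    "LLIEHPQGSDKLDSARMKLG"
    "SKAQDLKDRAQYYGQQHTGW"
    "EHDRDRTRGGQHTT"
)

CHARGED = set("DEKR")  # assumption: a simple proxy for hydrophilic residues


def region(seq, start, end):
    """Return residues start..end using 1-based, inclusive numbering."""
    return seq[start - 1:end]


def charged_fraction(seq):
    """Fraction of residues in seq that are D, E, K or R."""
    return sum(1 for aa in seq if aa in CHARGED) / len(seq)


CORE = region(OLEOSIN, 25, 101)

if __name__ == "__main__":
    print(len(OLEOSIN), len(CORE))
    print(f"charged fraction, whole protein: {charged_fraction(OLEOSIN):.3f}")
    print(f"charged fraction, residues 25-101: {charged_fraction(CORE):.3f}")
```

A proper hydropathy analysis would use an established scale and a sliding window; the charged-residue count is only a quick way to see that the central region is far less polar than the termini.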

Of particular interest as a targeting means for some
applications are oil-body specific sequences or fragments
thereof of the following formula which provide for targeting
to an oil body:

pp1 - aa25 - aa26 - V - V - T - L - aa31 - P -
A A A T

aa33 - G - G - aa36 - L - L - aa39 - L - aa41 -
M

G - I - aa44 - L - aa46 - aa47 - T - L - I -
S L S V V

aa51 - L - aa53 - V - A - T - P - L - aa59 - L -

L - F - S - P - V - L - V - P - A - A - L - aa73 -
I - I LI

aa74 - aa75 - aa76 - aa77 - aa78 - G - F - L -
G L

S - S - aa87 - G - V - aa89 - aa90 - L - S -
T I IT

aa93 - aa94 - S - aa96 - aa97 - aa98 - aa99 -
T

aa100 - aa101 - pp2

wherein:

pp1 and pp2 are the same or different, and may be the same as
or different from a natural oil-body protein, usually
different; they may be hydrogens, indicating the terminal
portion of the indicated polypeptide, or may be polypeptides
having a total of up to 1000 amino acids, more usually of up
to about 500 amino acids, and may have a total of as few as 1
amino acid, or may individually or separately be polypeptides
of from 1-100 amino acids, more usually from about 1-75 amino
acids, more particularly from about 5-50 amino acids; these
polypeptides will have specific applications in modifying a
specifically described sequence for a predetermined purpose;
aa25 may be any amino acid, particularly a neutral aliphatic
amino acid, generally of 3-6 carbon atoms, more particularly
leucine or alanine;

aa26 is a neutral aliphatic amino acid, particularly alanine,
or an hydroxy substituted amino acid of from 3-4 carbon atoms,
particularly threonine, or a basic amino acid of from 5-6
carbon atoms, particularly lysine;

aa31 is a neutral unsubstituted aliphatic amino acid of from
3-6 carbon atoms, particularly alanine, valine or leucine, or
an aromatic unsubstituted amino acid, particularly
phenylalanine;

aa33 is a neutral unsubstituted aliphatic amino acid of from
3-6 carbon atoms, particularly alanine, valine or leucine, or
an oxy-substituted aliphatic amino acid, particularly
threonine;

aa36 is a neutral aliphatic unsubstituted amino acid of from
3-5 carbon atoms, particularly leucine, or a neutral aliphatic
oxy-substituted amino acid of from 3-4 carbon atoms,
particularly threonine or serine;

aa37 is a neutral unsubstituted amino acid, particularly
leucine, or a thio-substituted amino acid, particularly
methionine;

aa39 is a neutral aliphatic unsubstituted amino acid,
particularly valine, or an aromatic unsubstituted amino acid,
particularly phenylalanine;

aa41 is a neutral aliphatic unsubstituted or oxy-substituted
amino acid, particularly alanine, leucine or serine;

aa44 is a neutral aliphatic unsubstituted or oxy-substituted
amino acid, particularly alanine, isoleucine or threonine;

aa46 is a neutral aliphatic unsubstituted amino acid or an
oxy-substituted amino acid, particularly alanine, valine or
threonine;

aa47 is a neutral aliphatic unsubstituted amino acid,
particularly glycine or alanine;

aa59 is a neutral aliphatic or aromatic unsubstituted amino
acid, particularly leucine or phenylalanine;

aa73 is a neutral aliphatic unsubstituted or thio-substituted
amino acid, particularly alanine, leucine or methionine;

aa78 is a neutral aliphatic unsubstituted amino acid,
particularly alanine, or a neutral aliphatic amino acid having
a thio- or an oxy-substitution, particularly methionine or
threonine;

aa83 is a neutral aliphatic unsubstituted or oxy-substituted
amino acid, particularly glycine, serine or threonine;

aa92 is a neutral aliphatic amino acid with an
oxy-substitution, particularly serine or threonine;

aa96 is a neutral aliphatic thio-substituted amino acid or a
neutral aromatic heterocyclic amino acid, particularly
tryptophan;

aa97 is a neutral aliphatic unsubstituted or thio-substituted
amino acid, particularly valine, leucine, isoleucine or
methionine;

aa98 is a neutral aliphatic unsubstituted amino acid or an
aromatic oxy-substituted amino acid, particularly alanine,
leucine or tyrosine;

aa99 may be any amino acid;

aa100 is an oxy-substituted amino acid, either aliphatic or
aromatic, particularly tyrosine or threonine;

aa101 is a neutral unsubstituted aliphatic or aromatic amino
acid, particularly alanine, leucine or phenylalanine.

Of particular interest as a source of DNA encoding sequences
capable of providing for targeting to an oil body protein are
oil-body protein genes obtainable from Arabidopsis or Brassica
napus which provide for expression of the protein of interest
in seed (See Taylor et al. (1990) Planta, 181:18-26). The
necessary regions and amino-acid sequences to provide
targeting ability to the oil body appear to be the highly
hydrophobic central region of oil body proteins.

To identify other oil body protein genes having the desired
characteristics, where an oil body protein has been or is
isolated, the protein may be partially sequenced, so that a
probe may be designed for identifying mRNA. Such a probe is
particularly valuable if it is designed to target the coding
region of the central hydrophobic domain which is highly
conserved among diverse species of plants. In consequence, a
DNA or RNA probe for this region may be particularly useful
for identifying coding sequences of oil body proteins from
other plant species. To further enhance the concentration of
the mRNA, cDNA may be prepared and the cDNA subtracted with
mRNA or cDNA from non-oil body producing cells. The residual
cDNA may then be used for probing the genome for complementary
sequences, using an appropriate library prepared from plant
cells. Sequences which hybridize to the cDNA under stringent
conditions may then be isolated.

In some instances, as described above, using an oil body
protein gene probe (conserved region), a probe may be employed
directly for screening a cDNA or genomic library and
identifying sequences which hybridize to the probe. The
isolation may also be performed by a standard immunological
screening technique of a seed-specific cDNA expression
library. Antibodies may be obtained readily for oil-body
proteins using the purification procedure and antibody
preparation protocol described by Taylor et al. (Planta,
(1990) 181:18-26). cDNA expression library screening using
antibodies is performed essentially using the techniques of
Huynh et al. (1985, in DNA Cloning, Vol. 1, a Practical
Approach, ed. D.M. Glover, IRL Press, pp. 49-78). Confirmation
of sequence is facilitated by the high conservation found in
the central hydrophobic region (see Fig. 1). DNA sequencing by
the method of Sanger et al. (Proc. Natl. Acad. Sci. USA,
(1977) 74:5463-5467) or Maxam and Gilbert (Meth. Enzymol.,
(1980) 65:497-560) may be performed on all putative clones and
homology searches performed. Homology of sequences encoding
the central hydrophobic domain is normally ≥70%, both at the
amino-acid and nucleotide level, between diverse species. If
an antibody is available, confirmation of sequence identity
may also be performed by hybrid-select and translation
experiments from seed mRNA preparations as described by
Sambrook et al. (Molecular Cloning, (1990) 2nd Ed., Cold
Spring Harbor Press, pp. 8-49 to 8-51).
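The homology figure of about 70% quoted above can be checked for a pair of candidate sequences once they have been aligned. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps; the function name and the decision to skip gapped columns are assumptions for illustration, not a method given in the specification.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must be pre-aligned strings of equal length ('-' = gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def passes_threshold(aligned_a, aligned_b, threshold=70.0):
    """True when identity meets the (assumed) 70% cut-off from the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The same function works for nucleotide or amino-acid alignments, since it only compares characters column by column.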

cDNA clones made from seed can be screened using cDNA probes
made from the conserved coding regions of any available oil
body protein gene (e.g., Bowman-Vance and Huang, J. Biol.
Chem., (1987) 262:11275-11279). Clones are selected which have
more intense hybridization with seed cDNAs as compared to
seedling cDNAs. The screening is repeated to identify a
particular cDNA associated with oil bodies of developing seeds
using direct antibody screening or hybrid-select and
translation. The mRNA complementary to the specific cDNA is
absent in other tissues which are tested. The cDNA is then
used for screening a genomic library and a fragment selected
which hybridizes to the subject cDNA.

To obtain expression of the chimeric gene in seed, a
transcriptional initiation regulatory region and translational
initiation regulatory region of untranslated 5' sequences,
"ribosome binding sites", responsible for binding mRNA to
ribosomes and translational initiation, obtainable from any
gene preferentially expressed in seed may be used. Examples of
such genes include seed storage proteins such as napin
(Josefsson et al., J. Biol. Chem., (1987) 262:12196-12201;
Scofield S.R. and Crouch M.L., J. Biol. Chem. (1987)
262:12202-12208). Preferably, the region is obtainable from an
oil body protein (oil-body proteins from Arabidopsis, carrot
(Hatzopoulos et al., supra) or maize (Huang et al. 1987 and
1990, supra)). The region generally comprises at least 100 bp
5' to the translational start of the structural gene coding
sequence, up to 2.5 kb 5' to the same translational start. It
is preferred that all of the transcriptional and translational
functional elements of the initiation control region are
derived from or obtainable from the same gene. By "obtainable"
is intended a DNA sequence sufficiently similar to that of a
native sequence to provide for the desired specificity of
transcription of the DNA sequence encoding the chimeric
protein. It includes natural and synthetic sequences and may
be a combination of synthetic and natural sequences.

The transcription level should be sufficient to provide an
amount of RNA capable of resulting in a modified seed. By
"modified seed" is meant seed having a detectably different
phenotype from a seed of a non-transformed plant of the same
species, for example one not having the expression cassette in
question in its genome. Various changes in phenotype are of
interest. These changes include over-expression of oil-body
protein or OBP-accumulation on the oil body or in the
cytoplasm of the resultant chimeric protein.

The polypeptide of interest may be any protein and includes,
for example, an enzyme, an anticoagulant, a neuropeptide, a
hormone, or an adhesive precursor. Examples of proteins
include interleukin-1-β, the anticoagulant hirudin, the enzyme
β-glucuronidase or a single-chain antibody comprising a
translational fusion of the VH or VL chains of an
immunoglobulin. The DNA sequence encoding the polypeptide of
interest may be synthetic, naturally derived, or combinations
thereof. Depending upon the nature or source of the DNA
encoding the polypeptide of interest, it may be desirable to
synthesize the DNA sequence with plant preferred codons. The
plant preferred codons may be determined from the codons of
highest frequency in the proteins expressed in the largest
amount in the particular plant species of interest as a host
plant.
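The determination of plant preferred codons described above amounts to tallying codon usage in the coding sequences of highly expressed genes and picking the most frequent codon for each amino acid. The sketch below assumes the standard genetic code and in-frame coding sequences as input; the function names and the toy input are hypothetical and serve only to illustrate the tally.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def preferred_codons(coding_seqs):
    """Map each amino acid (and '*' for stop) to its most frequent codon.

    coding_seqs: iterable of in-frame DNA coding sequences, e.g. from
    genes expressed at high levels in the intended host plant.
    """
    counts = defaultdict(Counter)
    for seq in coding_seqs:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)  # skips codons with ambiguous bases
            if aa is not None:
                counts[aa][codon] += 1
    return {aa: c.most_common(1)[0][0] for aa, c in counts.items()}


def back_translate(protein, preferred):
    """Encode a protein using the preferred codon for each residue."""
    return "".join(preferred[aa] for aa in protein)
```

For example, tallying the toy sequence `ATGGCTGCTGCCAAGTAA` selects GCT for alanine, because it occurs twice against once for GCC. In practice the tally would be built from many genes, and residues absent from the training set would need a fallback.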

The termination region which is employed will be primarily one
of convenience, since in many cases termination regions appear
to be relatively interchangeable. The termination region may
be native with the transcriptional initiation region, may be
native with the DNA sequence encoding the polypeptide of
interest, or may be derived from another source. Convenient
termination regions are available from the Ti-plasmid of A.
tumefaciens, such as the octopine synthase and nopaline
synthase termination regions.

Ligation of the DNA sequence encoding the targeting sequence
to the gene encoding the peptide of interest may take place in
various ways including terminal fusions, internal fusions, and
polymeric fusions. In all cases, the fusions are made so as
not to interrupt the reading frame of the oil-body protein and
so as to avoid any translational stop signals in or near the
junctions. The different types of terminal and internal
fusions are shown in Fig. 2 along with a representation of
their configurations in vivo.

In all the cases described, the ligation of the gene encoding
the peptide preferably would include a linker encoding a
protease target motif. This would permit the release of the
peptide once extracted as a fusion protein. Potential cleavage
sites which could be employed are recognition motifs for
thrombin (leu-val-pro-arg-gly) (Fujikawa et al., Biochemistry
(1972) 11:4892-4899), for factor Xa (phe-glu-gly-arg-aa)
(Nagai et al., Proc. Nat'l Acad. Sci. USA, (1985)
82:7252-7255) or for collagenase (pro-leu-gly-pro)
(Scholtissek and Grosse, Gene (1988) 62:55-64).
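The two requirements above, an uninterrupted reading frame without stop signals at the junctions and a linker encoding a protease target motif, can be checked on a candidate fusion before cloning. The sketch below is one way to do so, assuming the standard genetic code; the function names and the short DNA fragments are hypothetical placeholders, and the motifs are written in one-letter code exactly as they are given in the text.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Recognition motifs as listed in the text, in one-letter code.
MOTIFS = {"thrombin": "LVPRG", "factor Xa": "FEGR", "collagenase": "PLGP"}


def translate(dna):
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def build_fusion(targeting, linker, peptide):
    """Join the three parts and return the fusion protein.

    Raises ValueError if any part would shift the reading frame or if a
    stop codon occurs anywhere before the final codon.
    """
    for name, part in (("targeting", targeting), ("linker", linker),
                       ("peptide", peptide)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} part would shift the reading frame")
    protein = translate(targeting + linker + peptide)
    if "*" in protein[:-1]:
        raise ValueError("premature stop codon in or near a junction")
    return protein


def find_sites(protein):
    """Return {protease name: 0-based index} for each motif present."""
    sites = {}
    for name, motif in MOTIFS.items():
        index = protein.find(motif)
        if index != -1:
            sites[name] = index
    return sites


# Placeholder fragments: a 3-codon stand-in for the targeting sequence,
# a linker encoding leu-val-pro-arg-gly, and leu-enkephalin plus a stop.
fusion = build_fusion("ATGGCTCTT", "CTTGTTCCAAGAGGT", "TATGGTGGTTTTCTTTAA")
```

Here `fusion` carries the thrombin motif between the stand-in targeting residues and the peptide, so `find_sites(fusion)` reports a single thrombin site. A real design would also confirm that the motif does not occur inside the peptide of interest itself.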
By appropriate manipulations, such as
restriction, chewing back or filling in overhangs to
provide blunt ends, ligation of linkers, or the like,
complementary ends of the fragments can be provided for
joining and ligation. In carrying out the various steps,
cloning is employed, so as to amplify the amount of DNA and
to allow for analyzing the DNA to ensure that the operations
have occurred in proper manner. A wide variety of cloning
vectors are available, where the cloning vector includes a
replication system functional in E. coli and a marker which
allows for selection of the transformed cells. Illustrative
vectors include pBR322, pUC series, M13mp series, pACYC184,
etc. Thus, the sequence may be inserted into the vector at
an appropriate restriction site(s), the resulting plasmid
used to transform the E. coli host, the E. coli grown in an
appropriate nutrient medium and the cells harvested and
lysed and the plasmid recovered. Analysis may involve
sequence analysis, restriction analysis, electrophoresis, or
the like. After each manipulation the DNA sequence to be
used in the final construct may be restricted and joined to
the next sequence, where each of the partial constructs may
be cloned in the same or different plasmids.

A variety of techniques are available for the
introduction of DNA into a plant cell host. For example, the
chimeric DNA constructs may be introduced into host cells
obtained from dicotyledonous plants, such as tobacco, and
oleaginous species, such as Brassica napus, using standard
Agrobacterium vectors by a transformation protocol such as
that described by Moloney et al., Plant Cell Rep., (1989)
8:238-242 or Hinchee et al., Bio/Technol., (1988) 6:915-922;
or other techniques known to those skilled in the art. For
example, the use of T-DNA for transformation of plant cells
has received extensive study and is amply described in EPA
Serial No. 120,516; Hoekema, In: The Binary Plant Vector
System, Offset-drukkerij Kanters B.V., Alblasserdam, 1985,
Chapter V; Knauf, et al., Genetic Analysis of Host Range
Expression by Agrobacterium, In: Molecular Genetics of the
Bacteria-Plant Interaction, Puhler, A. ed.,
Springer-Verlag, NY, 1983, p. 245; and An et al., EMBO J.
(1985), 4:277-284. Conveniently, explants may be cultivated
with A. tumefaciens or A. rhizogenes to allow for transfer
of the transcription construct to the plant cells.
Following transformation using Agrobacteria the plant cells
are dispersed in an appropriate selective medium for
selection, grown to callus, shoots grown and plantlets
regenerated from the callus by growing in rooting medium.
The Agrobacterium host will contain a plasmid having the vir
genes necessary for transfer of the T-DNA to the plant cells
and may or may not have T-DNA. For injection and
electroporation (see below), disarmed Ti-plasmids (lacking
the tumor genes, particularly the T-DNA region) may be
introduced into the plant cell.

The use of non-Agrobacterium techniques permits
the use of the constructs described herein to obtain
transformation and expression in a wide variety of
monocotyledonous and dicotyledonous plants. These
techniques are especially useful for species that are
intractable in an Agrobacterium transformation system.
Other techniques for gene transfer include biolistics
(Sanford, Trends in Biotech. (1988) 5:299-302),
electroporation (Fromm et al. (1985) Proc. Nat'l. Acad.
Sci. USA, 82:5824-5828; Riggs and Bates (1986), Proc.
Nat'l. Acad. Sci. (USA) 83:5602-5606) or PEG-mediated DNA
uptake (Potrykus et al. (1985) Mol. Gen. Genet.,
199:169-177).

As a host cell, cells from any of a number of
seed bearing plants may be employed in which the cells are
derived from plant parts such as stem, leaf, root, or seed
or reproductive structures according to the species. The
cells may be isolated cells or plant parts, for example,
leaf discs. In a specific application, such as to Brassica
napus, the host cells generally will be derived from
cotyledonary petioles as described by Moloney et al. (Plant
Cell Rep., (1989) 8:238-242). Other examples using
commercial oil seeds include cotyledon transformation in
soybean explants (Hinchee et al., Biotechnology, (1988)
6:915-922) and stem transformation of cotton (Umbeck et al.,
Biotechnology, (1987) 5:263-266).

Following transformation, the cells, for example
as leaf discs, are grown in selective medium. Once shoots
begin to emerge, they are excised and placed onto rooting
medium. After sufficient roots have formed, the plants are
transferred to soil. Putative transformed plants are then
tested for presence of a marker. Southern blotting is
performed on genomic DNA using an appropriate probe, for
example an A. thaliana oleosin gene, to show that
integration of the desired sequences into the host cell
genome has occurred.

The expression cassette will normally be joined to
a marker for selection in plant cells. Conveniently, the
marker may be resistance to a herbicide, particularly an
antibiotic, such as kanamycin, G418, bleomycin, hygromycin,
chloramphenicol, or the like. The particular marker
employed will be one which will allow for selection of
transformed cells as compared to cells lacking the DNA
which has been introduced.

The fusion peptide in the expression cassette,
constructed as described above, is expressed at least
preferentially in developing seeds. Accordingly,
transformed plants, grown in accordance with conventional
ways, are allowed to set seed. See, for example, McCormick
et al., Plant Cell Reports (1986) 5:81-84. Northern blotting
can be carried out using an appropriate gene probe with RNA
isolated from tissue in which transcription is expected to
occur, such as a seed embryo. The size of the transcripts
can then be compared with the predicted size for the fusion
protein transcript.
Oil-body proteins are then isolated from the seed
and analyses performed to determine that the fusion peptide
has been expressed. Analyses can be, for example, by PAGE.
The fusion peptide can be detected using an antibody to the
oleosin portion of the fusion peptide. The size of the
fusion peptide obtained can then be compared with the
predicted size of the fusion protein.
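The predicted size used for this comparison can be estimated from the amino acid sequence. The sketch below is an approximation under stated assumptions: it sums standard average residue masses and adds one water, ignores any post-translational modification, and uses a two-residue stand-in for the carrier; the helper names are hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain for the free N- and C-termini


def predicted_mass_kda(protein):
    """Approximate mass of an unmodified polypeptide, in kDa."""
    return (sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER) / 1000.0


def fusion_shift_kda(carrier, linker, peptide):
    """Expected increase in mass of the fusion relative to the carrier alone."""
    return predicted_mass_kda(carrier + linker + peptide) - predicted_mass_kda(carrier)


# Illustrative inputs: "MA" stands in for a full carrier sequence, IEGR is a
# factor Xa site and VQGEESNDK a nine-residue peptide.
shift = fusion_shift_kda("MA", "IEGR", "VQGEESNDK")
print(round(shift, 2))
```

On a gel, a fusion band would be expected to run above the native oil-body protein by roughly this amount, about 1.4 kDa for the short linker and peptide used here.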

Two or more generations of transgenic plants may be
grown and either pollinated with the same transformed strain
or different strains, identifying the resulting hybrid
having the desired phenotypic characteristic, to ensure that
the subject phenotypic characteristic is stably maintained
and inherited, and then seeds harvested for isolation of the
peptide of interest or for use to provide seeds with the new
phenotypic property.

The desired protein can be extracted from seed
that is homo- or heterozygous for the introduced trait by a
variety of techniques, including use of an aqueous,
buffered extraction medium and a means of grinding,
breaking, pulverizing or otherwise disrupting the cells of
the seeds. The extracted seeds can then be separated (for
example, by centrifugation or sedimentation of the brei)
into three fractions: a sediment or insoluble pellet, an
aqueous supernatant, and a buoyant "scum" comprising seed
storage lipid and oil bodies. These oil bodies contain both
native oil-body proteins and chimeric oil-body proteins, the
latter containing the foreign peptide. The oil bodies are
separated from the water-soluble proteins and re-suspended
in aqueous buffer.

If a linker comprising a protease recognition
motif has been included in the expression cassette, a
protease specific for the recognition motif produced by
translation of the linker sequence is added to the
resuspension buffer. This releases the required peptide into
the aqueous phase. A second centrifugation step will now
re-float the processed oil bodies with their attached
proteins and leave an aqueous solution of the required
peptide. The desired peptide may be precipitated,
chemically modified or lyophilized according to its
properties and desired applications.

In certain applications it may not be necessary to
remove the chimeric protein from the oil-body protein. Such
an application would include cases where the fusion peptide
includes an enzyme which is tolerant to N- or C-terminal
fusions and retains its activity; such enzymes could be
used without further cleavage and purification. The
chimeric enzyme-OBP would be contacted with substrate as a
fusion protein. It is also possible, if desired, to purify
the enzyme-OBP fusion protein using an immunoaffinity
column comprising an immobilized high titre antibody against
the OBP (see, for example, Taylor et al., (1990) supra).

Other uses for the subject invention are as
follows. OBP's comprise a high percentage of total seed
protein, thus it is possible to enrich the seed for certain
desirable properties such as high-lysine, high-methionine,
and the like, simply by making the fusion protein rich in
the amino acid(s) of interest. This could find utility in
several settings; of particular interest is the modification
of grains and cereals which are used either directly or
indirectly as food sources for livestock, including cattle
and poultry, and for humans. It may be possible to include,
as the fusion peptide, an enzyme which may assist in
subsequent processing of the oil or meal in conventional
oilseed crushing and extraction, for example inclusion of a
thermostable lipid-modifying enzyme which would remain
active at the elevated crushing temperatures used to process
seed and thus add value to the extracted triglyceride or
protein product. Other uses of the fusion protein include
use to improve the agronomic health of the crop. For
example, an insecticidal protein or a portion of an
immunoglobulin specific for an agronomic pest, such as a
fungal cell wall or membrane, could be coupled to the
oil-body protein, thus reducing attack of the seed by a
particular plant pest.

The following examples are offered by way of
illustration and not by limitation.

EXPERIMENTAL

Example 1

Expression of terminal fusions of foreign peptides
with oil-body proteins

A. C-terminal fusions 

A genomic clone of an oil-body protein gene
containing at least 100 bp 5' to the translational start is
cloned into a plasmid vehicle capable of replication in a
suitable bacterial host (e.g., pUC or pBR322 in E. coli). A
restriction site is located in the region encoding the
hydrophilic C-terminal portion of the gene. In a 19 kDa OBP,
this region stretches typically from codon 125 to the end
of the clone. The ideal restriction site is unique, but
this is not absolutely essential. If no convenient
restriction site is located in this region, one may be
introduced by the site-directed mutagenesis procedure of
Kunkel, Proc. Nat'l. Acad. Sci. USA, (1985) 82:488-492. The
only major restriction on the introduction of this site is
that it must be placed 5' to the translational stop signal
of the OBP clone.
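The search for a suitable site can be illustrated with a short scan of the clone sequence. This is a sketch under stated assumptions: the enzyme list is limited to sites named elsewhere in this description, the toy sequence is invented, the reported position is the start of the recognition sequence rather than the exact cut point, and `unique_sites_in_region` is a hypothetical helper.

```python
# Recognition sequences of enzymes mentioned in this description.
# All five are palindromic, so scanning one strand is sufficient.
SITES = {
    "XhoI": "CTCGAG",
    "NcoI": "CCATGG",
    "SphI": "GCATGC",
    "NarI": "GGCGCC",
    "PstI": "CTGCAG",
}


def find_sites(seq, site):
    """Return the start positions (0-based) of every occurrence of site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def unique_sites_in_region(seq, region_start, region_end, sites=SITES):
    """Enzymes that cut the clone exactly once, within the given region.

    region_start/region_end delimit the stretch (for example the
    C-terminal coding region up to the stop codon) as a half-open interval.
    """
    result = {}
    for enzyme, site in sites.items():
        hits = find_sites(seq, site)
        if len(hits) == 1 and region_start <= hits[0] < region_end:
            result[enzyme] = hits[0]
    return result


# Invented toy clone: an XhoI site sits just before the TAA stop codon.
clone = "ATGGCAGAAACTCGAGTAA"
print(unique_sites_in_region(clone, 6, len(clone)))
```

If the scan returns nothing for the region of interest, that corresponds to the case in which a site would have to be introduced by site-directed mutagenesis.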

With this mutated clone in place, a synthetic
oligonucleotide adapter may be produced which contains
coding sequence for a protease recognition site such as
Pro-Leu-Gly-Pro or a multimer thereof. This is the
recognition site for the protease collagenase. The adapter
would be synthesized in such a way as to provide: a 4-base
overhang at the 5' end compatible with the restriction site
at the 3' end of the OBP clone, a 4-base overhang at the 3'
end of the adapter to facilitate ligation to the foreign
peptide coding sequence and additional bases, if needed, to
ensure no frame shifts in the transition between the OBP
coding sequence, the protease recognition site and the
foreign peptide coding sequence. A typical arrangement for
such a fusion is shown in Fig. 3. The example shown here
uses an existing XhoI site near the stop codon of a carrot
OBP (Hatzopoulos et al., Plant Cell, (1990) 2:457-467). This
is digested and may be ligated with an adapter constructed
from the two oligonucleotides described. This adapter will
form a perfect XhoI overhang at one end and will not disrupt
the translational frame. The other end forms an NcoI
overhang which is arbitrarily chosen (any six-base cutter
will suffice), but which encloses an ATG from the desired
foreign peptide.

The final ligation product will contain an almost
complete OBP gene, coding sequence for the collagenase
recognition motif and the desired peptide coding region, all
in a single reading frame. This tripartite fragment is
cloned into an Agrobacterium binary plasmid (Bevan, Nucl.
Acid Res., (1984) 12:8711-8721) such as is widely used to
transfer foreign DNA into plants (Fraley et al., Proc. Nat'l
Acad. Sci. USA, (1983) 80:4803-4807) and this is used to
transform oilseed plants such as rapeseed using the method
of Moloney et al. (Plant Cell Rep., (1989) 8:238-242) or a
similar procedure. Transgenic plants may be recovered from
this transformation experiment and these are grown to
flowering. The plants then set seed by self-fertilization.

The seeds are allowed to reach maturity (60-80
days) and then are harvested and ground in aqueous
extraction buffer (Taylor et al., Planta, (1990)
181:18-26). The slurry is centrifuged at 5000 x g for 20
min. and will give a surface scum. This scum is recovered
and suspended by vigorous shaking in a collagenase assay
buffer (Scholtissek and Grosse, Gene (1988) 62:55-64). Five
units of collagenase are added and the suspension is
incubated with shaking for 4 h. After this time, the
suspension is once again centrifuged at 5000 x g for 20 min.
The surface scum is removed and the protein content of the
aqueous phase is analyzed by SDS-polyacrylamide gel
electrophoresis. If a band of approximately the size of the
required peptide is found, the protein may be precipitated
using ammonium sulfate, concentrated using ultrafiltration
or lyophilized.

B. N-terminal fusions

The hydrophilic N-terminal end of oil-body
proteins permits the fusion of peptides to the N-terminal
while still assuring that the foreign peptide would be
retained on the outer surface of the oil body. The
configuration of such fusions is shown in Fig. 2 IB.

This configuration can be constructed from
similar starting materials as used for C-terminal fusions,
but requires the identification of a convenient restriction
site close to the translational start of the oil-body
protein gene. A convenient site may be created in many
oil-body protein genes without any alteration in coding
sequence by the introduction of a single base change just 5'
to the first 'ATG'. In oil-body proteins thus far studied,
the second amino acid is alanine, whose codon begins with a
"G". The context of the sequences is shown below:

                A-to-C change here yields NcoI site
                |
5' ... TC TCA ACA ATG GCA ...   Carrot OBP
5' ... CG GCA GCA ATG GCG ...   Maize 18 kDa OBP

A single base change at the adenine prior to the 'ATG'
would yield in both cases ... CCATGG ... which is an
NcoI site. Thus, modification of this base using the
site-directed mutagenesis protocol of Kunkel (Proc. Nat'l.
Acad. Sci. USA, (1985) 82:488-492) will prepare this clone
for use, assuming no other NcoI sites in the sequence.
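The single base change can be verified on the two sequence contexts given above. The sketch below is illustrative only; `add_ncoi_at_start` is a hypothetical helper, and the short strings are just the fragments quoted in this description, not full gene sequences.

```python
NCOI = "CCATGG"


def add_ncoi_at_start(seq, atg_index):
    """Change the base immediately 5' to the ATG to C and check for NcoI.

    Returns the mutated sequence. Raises ValueError if the surrounding
    context cannot give CCATGG with a single change, or if the resulting
    site is not unique in the supplied sequence.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 4] != "ATGG":
        raise ValueError("the ATG must be followed by G")
    if atg_index < 2 or seq[atg_index - 2] != "C":
        raise ValueError("a C is required two bases 5' to the ATG")
    mutated = seq[:atg_index - 1] + "C" + seq[atg_index:]
    if mutated.count(NCOI) != 1:
        raise ValueError("NcoI site is not unique in this sequence")
    return mutated


carrot = "TCTCAACAATGGCA"   # ... TC TCA ACA ATG GCA ...
maize = "CGGCAGCAATGGCG"    # ... CG GCA GCA ATG GCG ...

for name, seq in (("carrot", carrot), ("maize", maize)):
    mutated = add_ncoi_at_start(seq, seq.find("ATG"))
    print(name, mutated, mutated.find(NCOI))
```

Because the change lies upstream of the start codon, the coding sequence itself is unaltered, which is the point made in the text.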

The coding sequence for the foreign peptide may
require preparation which will allow its ligation directly
into the NcoI site. This may typically require a single or
two-base modification by site-directed mutagenesis (Kunkel,
1985, supra) to generate an NcoI site around the
translational start of the foreign peptide. This peptide is
then excised from its cloning vehicle using NcoI and a
second enzyme which cuts close to the translational stop of
the target. Again, using the methods described above, a
second convenient site can be introduced by site-directed
mutagenesis. It has been suggested by Qu and Huang (1990,
supra) that the N-terminal methionine might be removed
during processing of the protein in vivo and that the
alanine immediately downstream of this might be acylated.
To account for this possibility, it may be necessary to
retain the Met-Ala sequence at the N-terminal end of the
protein. This is easily accomplished using a variety of
strategies which introduce a convenient restriction site
into the coding sequence in or after the Ala codon. For
example, by site-directed mutagenesis, the sequences could
be modified as follows:



5' ... TC TCA ACA ATG GCA GAA CGA GGC ACT TAT
       mutate to
       TC TCA ACA ATG GCA TGC CGA GGC GCC TAT
                       SphI        NarI



This change of a single codon would introduce a SphI site
into the coding sequence. A second change, which could be
introduced during the same round of mutagenesis, would
convert two bases in codon 6 to yield GGC GCC, an NarI site.
This mutated gene could then be opened with SphI and NarI to
give a directional cloning cut which would eliminate three
codons. Into this site could be introduced an adapter
containing a 3' overhang with the sequence CATG ...
(compatible with SphI) and a GC 5' overhang at the opposite
end. The precise sequence of this adapter is shown below:

   SphI                                NarI
                  (n times)
              CCG CTC GGT CCG   GG
   CTACG      GGC GAG CCA GGC   CCGC



This adapter would recreate both the SphI and NarI
restriction sites, which would be used for diagnostic
purposes. The SphI site could now be used to open the
plasmid and clone in-frame a DNA fragment enclosing the
sequence for a useful peptide. Orientation of cloning
could then be analyzed by cutting at any asymmetrically
placed site and NarI of the plasmid.
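The repeated core of such an adapter, the collagenase motif Pro-Leu-Gly-Pro encoded n times, can be generated and checked as follows. This is a minimal sketch: it uses the codons CCG CTC GGT CCG that appear in the adapter shown above, covers only the repeated core (the flanking bases that form the SphI and NarI ends are left out), and `collagenase_linker` is a hypothetical helper name.

```python
# Codons for Pro-Leu-Gly-Pro as they appear in the adapter above.
MOTIF_CODONS = ("CCG", "CTC", "GGT", "CCG")
MOTIF_AA = {"CCG": "P", "CTC": "L", "GGT": "G"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def collagenase_linker(n):
    """Return (top, bottom) strands of n tandem Pro-Leu-Gly-Pro motifs.

    top is written 5'->3'; bottom is its complement written 3'->5', so
    the two strings align base for base when printed one above the other.
    """
    top = "".join(MOTIF_CODONS) * n
    bottom = top.translate(COMPLEMENT)
    return top, bottom


def translate_linker(top):
    """Translate the linker using only the three codons it contains."""
    return "".join(MOTIF_AA[top[i:i + 3]] for i in range(0, len(top), 3))


top, bottom = collagenase_linker(2)
print(top)
print(bottom)
print(translate_linker(top))
```

Since each motif is twelve bases long, any number of repeats keeps the downstream sequence in frame.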

The resultant constructs from these N-terminal
fusions would be typical of the examples IB of Figure 2.
They would contain an OBP promoter sequence, an in-frame
fusion in the first few codons of the OBP gene of a high
value peptide coding sequence with its own ATG as start
signal if necessary, and the remainder of the OBP gene and
terminator.

This modified gene is introduced into a binary
Agrobacterium plasmid (Bevan, (1984), supra) and mobilized
into Agrobacterium. Transformations are performed as
described above. Recovery of the high value peptide from
seeds is performed as described for 'C-terminal fusions.'

C. Internal translational fusions

A third type of fusion involves the placing of a
high value peptide coding sequence internally to the coding
sequence of the OBP. This type of fusion requires the same
strategy as in N-terminal fusions, but may only be
functional with modifications in regions of low
conservation, as it is believed that regions of high
conservation in these OBPs are essential for targeting of
the mature protein.

The key difference in this kind of fusion is the
necessity for flanking collagenase recognition sites for the
release of the protein. This means that in place of the
standard collagenase linker/adapter systems thus far
described, it is necessary to have a linker with the
following form:

                 (n times)                      (n times)
   Cohesive   CCG CTC GGT CCG   Restric-   CCG CTC GGT CCG   Cohesive
   end 1      GGC GAG CCA GGC   tion site  GGC GAG CCA GGC   end 2

Cohesive ends 1 and 2 would be used to clone the adapter
into an OBP clone in a directional manner. The nested
restriction site is then used to introduce the high value
peptide coding sequence flanked by appropriate restriction
sites or linkers. Orientation is checked by the use of an
asymmetrically placed restriction site in the high-value
peptide coding sequence and one of the two restriction sites
flanking the coding sequence of the collagenase recognition
motif.

Mobilization of these constructs to Agrobacterium
plasmids and then to plants is identical to the previously
described procedure. Recovery of the high-value protein
from the seeds of transgenic plants is somewhat different in
that, after the oil bodies have been isolated and washed, it
may be necessary to delipidate the oil bodies in order to
access the collagenase recognition sites, which could be
hidden inside the oil body in the lipid phase. This step
may reduce certain advantages of using oil-body proteins as
carriers, but may on the other hand be very convenient for
protein sequences which are labile in aqueous media or in
plant cytoplasms.



D. Inter-dimer translational fusions 

It is possible to create a construct in which the
entire coding sequence of the OBP is repeated. A dimeric
protein produced from this construct may still contain all
the necessary factors for targeting the OBP to the oil body.
Such a construct would contain a promoter region, an entire
or near-complete open reading frame for an OBP but excluding
the translational stop, and then an entire open reading
frame of a second OBP, this time equipped with a
translational 'stop' and a terminator region.

In the construction of this chimeric gene a pair
of dissimilar restriction sites are either found or created
at the region of the junction of the two copies. These
sites are used to enable the introduction of a linker such
as is described above for internal translational fusions.
The linker contains not only sets of collagenase recognition
motifs, but also an internal restriction site in which to
nest a sequence encoding a high value protein. The form of
this construct is shown in Fig. 2 III. Mobilization of this
construct to Agrobacterium and then to plants is exactly as
above. Recovery of the high value protein from seeds of the
transformed plants would be performed using the same
procedure as described for C-terminal fusions above.

Example 2

Strategy for the cloning and expression of
Interleukin-1-β (IL-1-β) as a fusion
with oleosins in plants

A. Cloning and sequencing of an Arabidopsis thaliana 
oleosin gene 

A Brassica napus oleosin gene (Murphy et al., (1991)
Biochim. Biophys. Acta 1088:86-94) was used to screen a
genomic library of A. thaliana (cv. Columbia) in EMBL3A
(Stratagene). The screening resulted in the isolation of an
EMBL3A clone (X2.1) containing a 15 kb genomic fragment
from A. thaliana. The oleosin gene was mapped within a 6.6 kb
KpnI insert within this 15 kb fragment (Fig. 5). A 1.8 kb
NcoI/KpnI fragment containing the oleosin gene was end-filled
and subcloned in the SmaI site of RF M13mp19. The 1.8 kb
insert was digested with convenient restriction enzymes
and subcloned in M13mp19 for sequencing. The 1800 bp
sequence of the A. thaliana oleosin gene is presented in
Fig. 1a. All the cloning procedures were carried out
according to Sambrook et al. (1989) (Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory
Press).

B. Design of an oligonucleotide encoding IL-1-β

IL-1-β consists of 9 amino acids (aa):
val-gln-gly-glu-glu-ser-asn-asp-lys (Antoni et al., (1986) J.
Immunol. 137:3201-3204). The protease factor Xa can cleave
a protein sequence which contains the aa sequence
ile-glu-gly-arg. Cleavage takes place after the aa arg.
Based on these sequences an oligonucleotide was designed
(GVR11, Fig. 5), which contains, in addition to the IL-1-β
coding sequence, the coding sequence for the factor Xa
cleavage site and 18 nucleotides of the 3' coding region of
the A. thaliana oleosin (base position 742-759). The IL-1-β
coding sequence was designed using optimal codon usage for
the B. napus and A. thaliana oleosin (Table 1).
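The design step can be sketched as a back-translation of the factor Xa site followed by the nine-residue peptide. The sketch is illustrative only and rests on loudly stated assumptions: the codon choices in `ILLUSTRATIVE_CODONS` are arbitrary valid codons, not the values of Table 1; the 18 oleosin nucleotides are omitted because their sequence is not given here; and the output is not claimed to be the actual GVR11 sequence.

```python
# One valid codon per amino acid. These choices are assumptions made for
# illustration and are NOT taken from Table 1.
ILLUSTRATIVE_CODONS = {
    "I": "ATC", "E": "GAG", "G": "GGT", "R": "AGA", "V": "GTT",
    "Q": "CAA", "S": "TCT", "N": "AAC", "D": "GAC", "K": "AAG",
}
CODON_TO_AA = {codon: aa for aa, codon in ILLUSTRATIVE_CODONS.items()}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

FACTOR_XA_SITE = "IEGR"        # ile-glu-gly-arg, cleaved after arg
IL1B_PEPTIDE = "VQGEESNDK"     # val-gln-gly-glu-glu-ser-asn-asp-lys


def back_translate(peptide):
    """Encode a peptide with the single codon chosen for each residue."""
    return "".join(ILLUSTRATIVE_CODONS[aa] for aa in peptide)


def translate(dna):
    """Translate DNA made only of the codons in ILLUSTRATIVE_CODONS."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def reverse_complement(dna):
    """Return the opposite strand, written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


coding = back_translate(FACTOR_XA_SITE + IL1B_PEPTIDE)
print(coding)                       # sense strand, 39 nucleotides
print(reverse_complement(coding))   # form usable as a downstream primer
```

Placing the factor Xa site directly before the peptide means that cleavage after arg would release the peptide without extra N-terminal residues. An oligonucleotide used as the downstream primer in a PCR would be synthesized as the reverse complement of the coding strand.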




WO 93/21320 PCT/CA92/00161 

[Table 1: codon usage of the B. napus and A. thaliana oleosins; the table contents are not recoverable from the source.]

C. Creation of an A. thaliana oleosin-IL-1-β fusion

Based on the sequence:

5' CACACCAGGAACTCTCTGGTAAGC 3'
(base position: -838 to -814), oligonucleotide GVR10
5' CACTGCAGGAACTCTCTGGTAAGC 3'
was designed. GVR10 contains a PstI restriction site
(CTGCAG) to facilitate cloning. The polymerase chain
reaction (PCR) was used to amplify the region between GVR10
and GVR11. The reaction mixture contained: 16 μl dNTPs (1.25
mM), 10 μl 10X PCR buffer (100 mM Tris-HCl pH 8.3, 500 mM
KCl, 15 mM MgCl2, 0.1% (w/v) gelatin), 5 μl GVR11 (20 μM),
1 μl Taq DNA polymerase (1 u/μl) and 64 μl H2O. The
reaction was carried out for 30 cycles. Each cycle
consisted of 1 minute denaturing at 92°C, 1 minute annealing
at 45°C and 3 minutes extension at 72°C. The PCR reaction
yielded a single fragment of 1652 nucleotides.
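The relationship between the genomic sequence and GVR10 can be checked directly from the two 24-mers given above. The short sketch below also adds up the programmed cycling time; the helper name `mismatches` is an assumption, and ramp times between temperatures are ignored.

```python
GENOMIC = "CACACCAGGAACTCTCTGGTAAGC"   # genomic sequence quoted above
GVR10 = "CACTGCAGGAACTCTCTGGTAAGC"     # primer carrying the PstI site
PSTI = "CTGCAG"


def mismatches(a, b):
    """Return the 0-based positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


changed = mismatches(GENOMIC, GVR10)
print("positions changed:", changed)
print("PstI site starts at:", GVR10.find(PSTI))
print("PstI site in genomic sequence:", PSTI in GENOMIC)

# Programmed time for 30 cycles of 1 + 1 + 3 minutes (ramping not included).
cycles = 30
minutes_per_cycle = 1 + 1 + 3
print("programmed cycling time (min):", cycles * minutes_per_cycle)
```

Only two bases near the 5' end of the primer differ from the genomic sequence, leaving the 3' portion fully matched to the template. That is presumably why the primer can still anneal at the 45°C used here.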

D. Cloning of the A. thaliana oleosin-IL-1-β (OBPIL)
fusion

A 5' SalI-nopaline synthase (nos) terminator-EcoRI
3' sequence was isolated from pBI121 (Clontech
Laboratories) and cloned into the SalI/EcoRI sites of pUC19.
The plasmid was called pTerm. The 1652 bp fragment
(described in C.) was isolated and digested with the
restriction enzymes PstI and SalI. This fragment was cloned
in pTerm. The resulting plasmid was called pUCOBPILT (Fig.
5). This plasmid was digested with EcoRI and PstI and
resulted in the digested pUC19 vector and the EcoRI-A.
thaliana oleosin-IL-1-β-nos-PstI fusion (OBPILT).
The complete sequence of OBPILT is shown in Fig. 7. OBPILT
was subcloned in the EcoRI/PstI sites of pBluescript+. This
plasmid (pBIOBPILT) was digested with PstI and HindIII and
the PstI-OBPILT-HindIII fragment was subcloned in a binary
Agrobacterium plasmid (Bin 19) (Bevan, M., (1984) Nucl.
Acid. Res. 12:8711-8721) containing a selection marker
(neomycin phosphotransferase) and unique PstI and HindIII
sites. The resulting plasmid was called pCGOBPILT. A
schematic representation of the cloning procedure is shown
in Fig. 5. For descriptions of various binary plasmids, see
pGA642 or 645, An et al. (1985) EMBO J. 4:277-284, or
pCGN1558 or 1559, MacBride and Summerfeldt (1990) Plant
Molec. Biol. 14:269-276.

F. Transformation of pCGOBPILT into Agrobacterium strain
EHA101

A single EHA101 colony (Hood et al., (1986) J.
Bact. 168:1291-1301) was used to inoculate 5 ml of LB+100
μg/ml kanamycin. This culture was grown for 48 hours at
28°C. This 5 ml culture was used to inoculate 500 ml of
LB+100 μg/ml kanamycin. This culture was grown at 28°C
until the culture reached a density of OD600=0.5 (approx. 4
hours). The cells were spun down (10 min, 5000 x g) and
resuspended in 500 ml of sterile H2O (repeated 2x). The
cells were spun again and resuspended in 3 ml sterile H2O
containing 10% glycerol. Aliquots of 40 μl of the cells were
placed in Eppendorf tubes and either used directly for
electroporation or stored at -80°C for future use.
Electroporation was carried out according to Dower et al.,
Nucl. Acid. Res. (1988) 16:6127-6145. The pulse generator
was set to the 25 μF capacitor, 2.5 kV and 200 ohm in
parallel with the sample chamber.
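For a capacitor-discharge pulse, these settings imply an exponential decay with time constant equal to resistance times capacitance. The arithmetic is sketched below under stated assumptions: the sample resistance is treated either as very large compared with the 200 ohm parallel resistor (the default) or as a value supplied by the user; the 5000 ohm figure is purely hypothetical, and the function names are illustrative.

```python
def parallel(r1_ohm, r2_ohm):
    """Combined resistance of two resistors connected in parallel."""
    return r1_ohm * r2_ohm / (r1_ohm + r2_ohm)


def time_constant_ms(capacitance_uF, shunt_ohm, sample_ohm=None):
    """Time constant (ms) of an exponential-decay pulse, tau = R * C.

    sample_ohm: resistance of the cell suspension (hypothetical input).
    When omitted, the sample is assumed to be far more resistive than
    the shunt resistor, so the shunt alone sets the discharge rate.
    """
    if sample_ohm is None:
        resistance = shunt_ohm
    else:
        resistance = parallel(shunt_ohm, sample_ohm)
    # microfarads -> farads (1e-6), seconds -> milliseconds (1e3)
    return resistance * capacitance_uF * 1e-6 * 1e3


# Settings from the text: 25 uF capacitor with 200 ohm in parallel.
print(time_constant_ms(25, 200))
# With a hypothetical 5000 ohm sample the pulse is slightly shorter.
print(time_constant_ms(25, 200, sample_ohm=5000))
```

With the stated settings the time constant comes to about 5 ms. A more conductive sample, for example cells not fully washed free of salts, would shorten it, which is consistent with the repeated water washes described above.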

G. Transformation of Nicotiana tabacum (tobacco) with
pCGOBPILT

The EHA101 containing pCGOBPILT was used to
transform tobacco leaf discs. Eight to ten centimeter long
tobacco leaves were taken from greenhouse grown plants,
sterilized in 70% ethanol for 20 sec. and then in 10% bleach
(such as Javex) for 8 min. The leaves were then rinsed 6
times with sterile water. The leaf edges as well as the
midrib were excised from the leaves and the remaining lamina
was sectioned into 5x7 mm squares or discs of 5 mm diameter.
About 30 leaf discs were collected and placed into a small
petri dish. The Agrobacterium solution was then poured over
the tobacco discs and incubation occurred for 9 minutes.
The leaf pieces were then blotted on sterile Whatman filter
paper and placed, abaxial side down, onto media I (MS, 3%
sucrose and 2 mg/l 2,4-D). Co-cultivation proceeded for
the following 48 hours. At this point the leaf discs were
transferred to selection media (MS, 3% sucrose, 2.5 mg/l BA,
0.1 mg/l NAA, 500 mg/l carbenicillin, and 100 mg/l
kanamycin) where they remained for the next 3-4 weeks. Once
shoots began to emerge they were excised and placed onto
rooting media (MS, 3% sucrose, 0.1 mg/l NAA, 500 mg/l
carbenicillin, and 50 mg/l kanamycin). After sufficient
roots had been formed, the tobacco plants were transferred
to soil.
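The supplement concentrations of the selection and rooting media lend themselves to a small lookup table for scaling to a chosen batch volume. This is a bookkeeping sketch only: the dictionary layout and the function `amounts_mg` are assumptions, and the MS salts and 3% sucrose base are left out.

```python
# Supplements in mg per litre, as listed for the tobacco protocol above.
MEDIA_MG_PER_L = {
    "selection": {"BA": 2.5, "NAA": 0.1, "carbenicillin": 500, "kanamycin": 100},
    "rooting": {"NAA": 0.1, "carbenicillin": 500, "kanamycin": 50},
}


def amounts_mg(medium, volume_l):
    """Return the mass (mg) of each supplement for a batch of volume_l litres."""
    return {name: conc * volume_l
            for name, conc in MEDIA_MG_PER_L[medium].items()}


# Half a litre of rooting medium and two litres of selection medium.
print(amounts_mg("rooting", 0.5))
print(amounts_mg("selection", 2))
```

The table makes the one difference in antibiotic selection between the two stages easy to see: kanamycin drops from 100 mg/l to 50 mg/l at rooting, while carbenicillin stays at 500 mg/l.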



H. Transformation of B. napus with pCGOBPILT 

The transformation of B. napus was carried out 
according to Moloney et al., (1989) Plant Cell Rep. 8:238-242, 
which disclosure is incorporated herein by reference. 

Transformation Procedure 

Single colonies of Agrobacterium tumefaciens 
strain EHA 101 containing the binary plasmid were grown 
overnight at 28°C in AB medium. A 50 µl sample of this 
suspension was grown overnight at 28°C in 5 ml of MG/L 
broth supplemented with appropriate antibiotics. This 
bacterial suspension was pelleted by centrifugation for 15 
min. at 10,000 x g then resuspended in 10 ml of MS medium 
containing 3% sucrose at pH 5.8. A thin film of this 
suspension was used to cover the base of a 5 cm petri dish. 
Individual excised cotyledons were taken from the plates 
described above and the cut surface of their petioles was 
immersed into this bacterial suspension for a few seconds. 
They were immediately returned to the same MS plates from 
which they had been taken. The cotyledons were co-cultivated 
with the Agrobacterium for 72 h. No feeder 
layers were employed. 

After co-cultivation, the cotyledons were 
transferred to regeneration medium comprising MS medium 
supplemented with 20 µM benzyladenine, 3% sucrose, 0.7% 
phytagar, pH 5.8 and 500 mg/l carbenicillin (Pyopen, 
Ayerst) and 15 mg/l kanamycin sulphate (Boehringer-Mannheim). 
Again the petioles were carefully embedded in 
the agar to a depth of 2 mm. Plating density was 
maintained at 10 explants per plate. Higher densities 
reduce regeneration frequency. 



Selection and Plant Regeneration 

The explants were maintained on regeneration 
medium under light and temperature conditions specified 
above for 2-3 weeks. During this time many shoots appeared 
on over half the explants with relatively little callus 
formation. Some of these shoots underwent bleaching by the 
fourth week of culture. The remaining green shoots were 
subcultured onto shoot elongation medium which was the same 
as regeneration medium minus the benzyladenine. One or two 
weeks on this medium permitted the establishment of apical 
dominance from the shoot clusters formed. The shoots so 
derived were transferred to "rooting" medium containing MS 
medium, 3% sucrose, 2 mg/l indole butyric acid, 0.7% phytagar 
and 500 mg/l carbenicillin. No kanamycin was used at this 
stage as it was found that more rapid root establishment 
occurred without the selection agent while very few 
"escapes" actually succeeded in rooting after the two 
rounds of selection on regeneration and shoot elongation 
medium. 
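
The B. napus media give benzyladenine in molar units (20 µM) while the other supplements are given in mg/l, so it can help to put them on a common scale. The sketch below is illustrative only; the molecular weights (about 225.25 g/mol for 6-benzyladenine and about 203.24 g/mol for indole-3-butyric acid) are standard reference values rather than figures from the text, and the function names are hypothetical.

```python
# Approximate molecular weights in g/mol (reference values, not from the text).
MW_G_PER_MOL = {
    "benzyladenine": 225.25,
    "indole butyric acid": 203.24,
}


def micromolar_to_mg_per_l(compound, micromolar):
    """Convert a concentration in micromolar to mg per litre."""
    # umol/l * g/mol = ug/l; divide by 1000 to obtain mg/l.
    return micromolar * MW_G_PER_MOL[compound] / 1000.0


def mg_per_l_to_micromolar(compound, mg_per_l):
    """Convert a concentration in mg per litre to micromolar."""
    return mg_per_l * 1000.0 / MW_G_PER_MOL[compound]


ba_mg_per_l = micromolar_to_mg_per_l("benzyladenine", 20.0)
iba_micromolar = mg_per_l_to_micromolar("indole butyric acid", 2.0)

print(f"20 uM benzyladenine is about {ba_mg_per_l:.2f} mg/l")
print(f"2 mg/l indole butyric acid is about {iba_micromolar:.1f} uM")
```

On these assumptions, 20 µM benzyladenine corresponds to roughly 4.5 mg/l, and 2 mg/l indole butyric acid to a little under 10 µM.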

I. Stable integration of OBPILT in the tobacco and 
B. napus genomes 

Putative transformed plants were tested for 
neomycin phosphotransferase activity. Genomic DNA from 
plants showing this activity was isolated. Southern 
blotting was performed in order to demonstrate that the 
sequences between the T-DNA borders (OBPILT and neomycin 
phosphotransferase gene) were stably integrated into the 
genomes of B. napus and tobacco. The tobacco Southern was 
probed with the A. thaliana oleosin gene and the neomycin 
phosphotransferase gene. The B. napus Southern was probed 
with the neomycin phosphotransferase gene. 

J. Expression of the oleosin-IL-1-β fusion in tobacco 
plants 

RNA was isolated from developing embryos obtained 
from transformed and untransformed plants. Northern 
blotting was carried out using the A. thaliana oleosin 
gene as a probe. In all the tested transformed plants an 850 nt 
transcript could be detected. The size of these transcripts 
corresponds to the expected size of the oleosin-IL-1-β mRNA. 
These transcripts could not be detected in the untransformed 
plants. 

K. Accumulation of the oleosin-IL-1-β protein 

Oil-body proteins were isolated from transformed 
tobacco seeds (Holbrook et al., (1991) Plant Physiol. 
97:1051-1058). PAGE was performed and the proteins were 
transferred from the gel to PVDF membranes. An antibody, 
which was raised against a 22 kDa oleosin of B. napus, was 
used to detect the oleosin-IL-1-β fusion in the tobacco 
seeds. This antibody recognizes all the major oleosins in 
B. napus and A. thaliana. In addition, this antibody 
recognizes the tobacco oleosins. Tobacco oleosins have 
different sizes from the A. thaliana and B. napus oleosins. 
In the transformed tobacco seeds the anti-22 kDa antibody 
recognized a 20 kDa protein, which was not present in the 
untransformed tobacco seed. The predicted size of the 
oleosin-IL-1-β fusion is 20.1 kDa. A summary of the results 
is shown in Table 2. 

By expressing a peptide of interest conjugated to 
an oil body protein, or a sufficient portion thereof to 
provide for targeting to the oil bodies, the peptide of 
interest can be easily purified so as to be substantially 
free of other cellular components. The fusion protein can 
be cleaved following purification or may be used without 
cleavage. The subject methods and compositions provide a 
fast, simple method for purifying a polypeptide of interest. 

All publications and patent applications 
mentioned in this specification are herein incorporated by 
reference to the same extent as if each individual 
publication or patent application was specifically and 
individually indicated to be incorporated by reference. 

The invention now being fully described, it will 
be apparent to one of ordinary skill in the art that many 
changes and modifications can be made thereto without 
departing from the spirit or scope of the appended claims. 

WHAT IS CLAIMED IS: 

1. A polypeptide, characterized as capable of 
targeting an oil body, and having the formula 

pp1 - aa25 - aa26 - V - V - T - L - aa21 - P - 
A A A T 

aa34 - G - G - aa36 - L - L - aa39 - L - aa41 - 
M 

G - I - aa44 - L - aa46 - aa47 - T - L - I - 
S L S V V 

aa51 - L - aa53 - V - A - T - P - L - aa59 - L 

L - F - S - P - V - L - V - P - A - A - L - aa73 - 
I I L I 

aa74 - aa75 - aa76 - aa77 - aa78 - G - F - L - 

S - S - aa87 - G - V - aa89 - aa90 - L - S - 

aa93 - aa94 - S - aa96 - aa97 - aa98 - aa99 
T 

aa100 - aa101 - pp2 

with the proviso that said peptide is other than the full 
length naturally occurring 16 Kd oleosin from carrot or the 
18 Kd or 16 Kd oleosin from maize. 



2. The polypeptide according to Claim 1, wherein 
at least one of pp1 and pp2 comprises a polypeptide of 
interest. 

3. The polypeptide according to Claim 1, wherein 
at least one of pp1 and pp2 comprises an antigenic amino 
acid sequence to provide an immunogen. 

4. A polypeptide characterized as capable of 
targeting an oil body selected from the group consisting 
of: 

(a) a peptide comprising at least eight consecutive 
amino acids included in the following amino acid sequence: 

                 10                  20 
M-N-G-R-D-R-D-Q-Y-Q-M-S-G-R-G-S-D-Y-S-K- 
                 30                  40 
S-R-Q-I-A-K-A-A-T-A-V-T-A-G-G-S-L-L-V-L- 
                 50                  60 
L-S-L-T-L-V-G-T-V-I-A-L-T-V-A-T-P-L-L-V- 
                 70                  80 
I-S-S-T-I-L-V-P-A-L-I-T-V-A-L-L-I-T-G-S- 
                 90                 100 
L-S-S-G-G-F-G-I-A-A-I-T-V-F-S-W-I-Y-K*Y- 
                110                 120 
L-L-I-E-H-P-Q-G-S-D-K-L-D-S-A-R-M-K-L-G- 
                130                 140 
S-K-A-Q-D-L-K-D-R-A-Q-Y-Y-G-Q-Q-H-T-G-W- 
                150 
E-H-D-R-D-R-T-R-G-G-Q-H-T-T; and 

(b) a peptide that is encoded by a DNA sequence 
identified by means of an oligonucleotide probe designed 
based upon said amino acid sequence in (a) or a fragment 
thereof, with the proviso that said peptide is other than 
the full length naturally occurring 16 Kd oleosin from carrot 
or the 18 Kd or 16 Kd oleosin from maize. 



5. The polypeptide according to Claim 4, 
wherein said peptide in (a) comprises at least twelve 
consecutive amino acids included in the following amino 
acid sequence: 

         30                  40 
A-K-A-A-T-A-V-T-A-G-G-S-L-L-V-L- 
                 50                  60 
L-S-L-T-L-V-G-T-V-I-A-L-T-V-A-T-P-L-L-V- 
                 70                  80 
I-S-S-T-I-L-V-P-A-L-I-T-V-A-L-L-I-T-G-S- 
                 90                 100 
L-S-S-G-G-F-G-I-A-A-I-T-V-F-S-W-I-Y-K*Y- 
101 
L. 

6. A DNA construct comprising a DNA sequence 
encoding (a) an oleosin or a portion thereof sufficient to 
provide for targeting to an oil body and (b) a polypeptide 
of interest. 

7. The DNA construct according to Claim 6, 
further comprising: 

vector DNA containing at least one regulatory 
sequence operatively associated with said DNA sequence 
which is capable of directing replication of said DNA in a 
host cell. 

8. The DNA construct according to Claim 7, 
wherein said regulatory sequence is further capable of 
directing expression of said DNA in a host cell. 

9. The DNA construct according to Claim 6, 
wherein said DNA is cDNA. 

10. An expression cassette comprising: 

as components, in the direction of transcription, 
a first DNA sequence comprising a sufficient portion of the 
region 5' to the translational start site of a gene 
expressed in seed to provide for expression of a DNA 
sequence in seed; a second DNA sequence encoding an oleosin 
or a sufficient portion thereof to provide for targeting to 
an oil body, said second DNA sequence including at least one 
natural or synthetic restriction site; and a translational 
and transcriptional termination region; wherein said 
components are operably linked and expression of said 
second DNA sequence is regulated by said first DNA 
sequence. 



11. The expression cassette according to Claim 
10, wherein said first DNA sequence is from a gene 
expressed in a cereal or grain seed cell. 

12. The expression cassette according to Claim 
10, wherein the genome of Arabidopsis thaliana comprises at 
least one of said first DNA sequence and said second DNA 
sequence. 

13. The expression cassette according to Claim 
10, wherein said second DNA sequence encodes a peptide from 
the group consisting of: 

(a) a peptide comprising at least eight amino 
acids of and up to the full sequence of the following amino 
acid sequence: 

                 10                  20 
M-N-G-R-D-R-D-Q-Y-Q-M-S-G-R-G-S-D-Y-S-K- 
                 30                  40 
S-R-Q-I-A-K-A-A-T-A-V-T-A-G-G-S-L-L-V-L- 
                 50                  60 
L-S-L-T-L-V-G-T-V-I-A-L-T-V-A-T-P-L-L-V- 
                 70                  80 
I-S-S-T-I-L-V-P-A-L-I-T-V-A-L-L-I-T-G-S- 
                 90                 100 
L-S-S-G-G-F-G-I-A-A-I-T-V-F-S-W-I-Y-K*Y- 
                110                 120 
L-L-I-E-H-P-Q-G-S-D-K-L-D-S-A-R-M-K-L-G- 
                130                 140 
S-K-A-Q-D-L-K-D-R-A-Q-Y-Y-G-Q-Q-H-T-G-W- 
                150 
E-H-D-R-D-R-T-R-G-G-Q-H-T-T; and 

(b) a peptide that is encoded by a DNA sequence 
identified by means of an oligonucleotide probe designed 
based upon said amino acid sequence in (a) or a fragment 
thereof. 

14. An expression cassette comprising: 

an oil body protein (OBP) gene which includes a 
sufficient portion of the region 5' to the translational 
start site to provide for expression of said gene in a seed 
cell and which includes at least one restriction site 
between just 5' to the codon for the initiating methionine 
and 5' to the translational stop signal of said OBP gene. 

15. The expression cassette according to Claim 
14, further comprising a DNA sequence encoding a 
polypeptide of interest inserted into said restriction site 
in reading frame with said OBP gene. 

16. The expression cassette according to Claim 
14, wherein said restriction site is a synthetic 
restriction site. 

17. The expression cassette according to Claim 
14, further comprising an oligonucleotide adapter coding for 
a protease recognition site inserted into said restriction 
site. 

18. The expression cassette according to Claim 
17, wherein said protease is collagenase. 






19. An expression cassette comprising: 
a first DNA sequence encoding a polypeptide of 
interest inserted in reading frame into an oil body protein 
(OBP) gene which includes a sufficient portion of the 
regulatory region 5' to the translational start site of said 
OBP gene to provide for expression of said gene in seed, 
wherein said sequence is inserted at a site in said gene so 
as to be expressed under said regulatory region. 

20. The expression cassette according to Claim 
19, further comprising a second DNA sequence encoding a 
protease recognition site 5' to said first DNA sequence, 
wherein said second DNA sequence is in reading frame with 
said first DNA sequence and said OBP gene. 

21. A method for obtaining expression of a 
polypeptide of interest in seed, said method comprising: 

transforming a host plant cell with an expression 
cassette under genomic integration conditions, wherein said 
expression cassette comprises as components, in the 
direction of transcription, a first DNA sequence comprising 
a sufficient portion of the region 5' to the translational 
start site of a gene expressed in seed to provide for 
expression of a DNA sequence in seed; a second DNA sequence 
encoding an oleosin or a sufficient portion thereof to 
provide for targeting to an oil body, said second DNA 
sequence including at least one natural or synthetic 
restriction site into which is inserted in reading frame a 
third DNA sequence encoding a polypeptide of interest; and a 
translational and transcriptional termination region; 
wherein said components are operably linked and expression 
of said second DNA sequence is regulated by said first DNA 
sequence to provide for expression in seed. 

22. The method according to Claim 21, wherein 
the genome of Arabidopsis thaliana comprises at least one of 
said first and said second DNA sequence. 

23. A method for obtaining expression of a 
polypeptide of interest in seed, said method comprising: 

transforming a host plant cell with a DNA 
construct under genomic integration conditions, wherein 
said DNA construct comprises a first DNA sequence encoding a 
polypeptide of interest inserted in reading frame into an 
oil body protein (OBP) gene which includes a sufficient 
portion of the regulatory region 5' to the translational 
start site of said OBP gene to provide for expression of 
said gene in seed, wherein said sequence is inserted at a 
site in said gene so as to be expressed under said 
regulatory region, whereby said DNA construct becomes 
integrated into the genome of said plant cell; and 

growing said plant to produce seed whereby said 
polypeptide of interest is expressed as a fusion protein 
with the expression product of said OBP gene. 

24. The method according to Claim 23, further 
comprising: 

isolating said fusion protein from oil bodies in 
cells of said seed. 

25. The method according to Claim 24, wherein 
said isolating comprises: 

lysing cells of said seed to release said oil 
bodies; and 

disrupting said oil bodies whereby said fusion 
polypeptide is released. 

26. The method according to Claim 25, wherein 
said isolating further comprises: 

contacting said fusion polypeptide with a 
protease capable of recognizing a protease recognition site 
in said fusion polypeptide located prior to the N-terminus 
of said polypeptide of interest. 

27. The method according to Claim 26, further 
comprising: 

prior to said contacting, binding said fusion 
protein to a solid support comprising an antibody capable of 
binding to the expression product of said OBP gene. 

28. A method for obtaining a purified 
polypeptide of interest, said method comprising: 

transforming a host plant cell with a DNA 
construct under genomic integration conditions, wherein 
said DNA construct comprises a first DNA sequence encoding a 
polypeptide of interest inserted in reading frame into an 
oil body protein (OBP) gene which includes a sufficient 
portion of the regulatory region 5' to the translational 
start site of said OBP gene to provide for expression of 
said gene in seed, wherein said sequence is inserted at a 
site in said gene so that expression of said DNA sequence is 
controlled by said regulatory region, whereby said DNA 
construct becomes integrated into the genome of said plant 
cell; 

growing said plant to produce seed whereby said 
polypeptide of interest is expressed as a fusion protein 
with the expression product of said OBP gene; 

isolating oil bodies from the cells of said seed; 

disrupting said oil bodies whereby said fusion 
protein is released; and 

purifying said polypeptide of interest. 



29. The method according to Claim 28, wherein 
said polypeptide of interest is other than a polypeptide 
encoded by a plant genome. 

30. The method according to Claim 28, wherein 
said polypeptide of interest is other than a polypeptide 
naturally present in an oil body. 

31. The method according to Claim 30, wherein 
said isolating comprises: 

collecting an oil-body fraction following lysis of 
cells from said seed. 

32. A plant cell comprising: 

an expression cassette according to any one of 
Claims 10, 14, or 19. 

33. A plant comprising cells containing an 
expression cassette according to any one of Claims 10, 14, 
or 19. 

34. A plant expressing a polypeptide of interest 
in seed, obtained according to the method of Claim 23. 

35. Seed comprising an expression cassette 
according to any one of Claims 10, 14, or 19. 

36. Seed expressing a polypeptide of interest, 
obtained according to the method of Claim 23. 

37. A method for obtaining a polypeptide of 
interest in an oil body, said method comprising: 

expressing said polypeptide in seed as a fusion 
protein with an oleosin, or a sufficient portion thereof to 
provide targeting to said oil body. 

               Xho I 
AGC ACT GCT CGA GAG ACT TCA AGG ACT       Carrot OBP 
Ser Thr Ala Arg Asp Thr Ser Arg Thr....   (see Hatzopoulos et al. 1990) 

         Xho I   n times   Nco I 
... AGC ACT GC TCGA CCG CTC GGT CCG GC 
    TCG TGA CGAGCT GGC GAG CCA GGC CGGTAC 
                   Pro Leu Gly Pro 
           Collagenase recognition motif 

FIG. 3. 
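
The linker in FIG. 3 can be checked computationally: the motif codons should translate to Pro-Leu-Gly-Pro, the lower strand should be the complement of the upper one, and repeating the motif "n times" should leave the reading frame intact because each repeat is a whole number of codons. The sketch below is illustrative only; it uses the standard genetic code, and the function names and the choice of n = 3 are hypothetical.

```python
# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Return the base-by-base complement (not reversed)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


# Motif codons from FIG. 3: upper strand and the strand drawn beneath it.
MOTIF_TOP = "CCG CTC GGT CCG".replace(" ", "")
MOTIF_BOTTOM = "GGC GAG CCA GGC".replace(" ", "")


def repeated_motif(n):
    """Return n tandem copies of the collagenase motif coding sequence."""
    return MOTIF_TOP * n


print(translate(MOTIF_TOP))
print(complement(MOTIF_TOP) == MOTIF_BOTTOM)
print(translate(repeated_motif(3)))
```

Because the motif is 12 bases long, any number of tandem repeats is a multiple of three bases and does not shift the frame of the downstream coding sequence.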

Construct fusion gene using protease (e.g. collagenase) target 
motif coding linkers; clone in E. coli compatible plasmid 

↓ 

Insert construct into wide host range replicon containing T-DNA 
borders (i.e. Agrobacterium binary vector) 

↓ 

Transform plant cells using leaf, stem, cotyledonary 
or petiole explants 

↓ 

Regenerate transgenic plants 

↓ 

Allow to set seed 

↓ 

Grind seed in aqueous extraction buffer 
(Taylor et al, 1990) 

↓ 

Centrifuge or otherwise separate oil-bodies 

↓ 

Suspend washed oil-bodies in collagenase assay buffer 
and add 5 units purified collagenase 

↓ 

Centrifuge to separate out residual oil-bodies 

↓ 

Isolate released peptide from protease treatment 
by ammonium sulfate precipitation, column chromatography etc. 

↓ 

Perform biological assay on released recombinant peptide 

FIG. 4. 

INTERLEUKIN gvrll (71-mer) 

a)  A. thaliana oleosin          Factor Xa 

    gly gly gln his thr thr ala ile glu gly arg val gln 
    GGT GGC CAG CAC ACT ACT GCG ATC GAA GGG AGA GTT CAG 
                            PvuI 

    gly glu glu ser asn asp lys OCH val asp 
    GGA GAA GAA TCT AAC GAC AAG TAA GTC GAC GG 
                                    SalI 

b)  3' CCA CCG GTC GTG TGA TGA CGC TAG CTT CCC TCT CAA GTC 
       CCT CTT CTT AGA TTG CTG TTC ATT CAG CTG CC 5' 

FIG. 6. 
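
The 71-mer in FIG. 6 can be cross-checked in a few lines: the upper strand should translate to the residues printed above it, strand b) should be its complement, and the PvuI and SalI sites should be present. The sketch below is illustrative only. It uses the standard genetic code, takes the alanine codon as GCG (the reading that is complementary to strand b) and that matches the corresponding stretch of FIG. 7), and uses hypothetical function and variable names.

```python
# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Return the base-by-base complement (not reversed)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


# Strand a), read 5' to 3'.
TOP = (
    "GGT GGC CAG CAC ACT ACT GCG ATC GAA GGG AGA GTT CAG "
    "GGA GAA GAA TCT AAC GAC AAG TAA GTC GAC GG"
).replace(" ", "")

# Strand b), written 3' to 5' as in the figure.
BOTTOM_3_TO_5 = (
    "CCA CCG GTC GTG TGA TGA CGC TAG CTT CCC TCT CAA GTC "
    "CCT CTT CTT AGA TTG CTG TTC ATT CAG CTG CC"
).replace(" ", "")

print(len(TOP))                           # length of the oligonucleotide
print(translate(TOP))                     # one-letter translation
print(complement(TOP) == BOTTOM_3_TO_5)   # strands are complementary
print("CGATCG" in TOP, "GTCGAC" in TOP)   # PvuI and SalI recognition sites
```

The translation reads GGQHTTA (oleosin C-terminus), IEGR (Factor Xa site), VQGEESNDK (the interleukin peptide), then the ochre stop codon followed by the Val-Asp codons that form the SalI site.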

CACTGCAGGAACTCTCTGGTAAGCTAGCTCCACTCCCCAGAAACAACCGGCGCCAAATTGCC 
GGAATTGCT 

GACCTGAAGACGGAACATCATCGTCGGGTCCTTGGGCGATTGCGGCGGAAGATGGGTCAGCT 
TGGGCTTGAG 

GACGAGACCCGAATCGAGTCTGTTGAAAGGTTGTTCATTGGGATTGTATACGGAGATTGGTC 
GTCGAGAGG 

TTTGAGGGAAAGGACAAATGGGTTTGGCTCTGGAGAAAGAGAGTGCGGCTTTAGAGAGAGAA 
TTGAGAGGTT 

TAGAGAGAGATGCGGCGGCGATGACGGGAGGAGAGACGACGAGGACCTGCATTATCAAAGCA 
GTGACGTGGT 

GAAATTTGGAACTTTTAAGAGGCAGATAGATTTATTATTTGTATCCATTTTCTTCATTGTTC 
TAGAATGTCG 

CGGAACAAATTTTAAAACTAAATCCTAAATTTTTCTAATTTTGTTGCCAATAGTGGATATGT 
GGGCCGTATA 

GAAGGAATCTATTGAAGGCCCAAACCCATACTGACGAGCCCAAAGGTTCGTTTTGCGTTTTA 
TGTTTCGGTT 

CGATGCCAACGCCACATTCTGAGCTAGGCAAAAAACAAACGTGTCTTTGAATAGACTCCTCT 
CGTTAACACA 

TGCAGCGGCTGCATGGTGACGCCA1 TAACACGTGGCCTACAATTGCATGATGTCTCCATTGA 
CACGTGACTT 

CTCGTCTCCTTTCTTAATATATCTAACAAACACTCCTACCTCTTCCAAAATATATACACATC 
TTTTTGATCA 

ATCTCTCATTCAAAATCTCATTCTCTCTAGTAAACAAGAACAAAAAAATGGCGGATACAGCT 
AGAGGAACCC 

ATCACGATATCATCGGCAGAGACCAGTACCCGATGATGGGCCGAGACCGAGACCAGTACCAG 
ATGTCCGGAC 

GAGGATCTGACTACTCCAAGTCTAGGCAGATTGCTAAAGCTGCAACTGCTGTCACAGCTGGT 
GGTTCCCTCC 

TTGTTCTCTCCAGCCTTACCCTTGTTGGAACTGTCATAGCTTTGACTGTTGCAACACCTCTG 
CTCGTTATCT 

TCAGCCCAATCCTTGTCCCGGCTCTCATCACAGTTGCACTCCTCATCACCGGTTTTCTTTCC 
TCTGGAGGGT 

TTGGCATTGCCGCTATAACCGTTTTCTCTTGGATTTACAAGTAAGCACACATTTATCATCTT 
ACTTCATAAT 

TTTGTGCAATATGTGCATGCATGTGTTGAGCCAGTAGCTTTGGATCAATTTTTTTGGTCGAA 
TAACAAATGT 

AACAATAAGAAATTGCAAATTCTAGGGAACATTTGGTTAACTAAATACGAAATTTGACCTAG 
CTAGCTTGAA 

TGTGTCTGTGTATATCATCTATATAGGTAAAATGCTTGGTATGATACCTATTGATTGTGAAT 
AGGTACGCAA 

CGGGAGAGCACCCACAGGGATCAGACAAGTTGGACAGTGCAAGGATGAAGTTGGGAAGCAAA 
GCTCAGGATC 

TGAAAGACAGAGCTCAGTACTACGGACAGCAACATACTGGTTGGGAACATGACCGTGACC 
ACTCGTGGTG 

GCCAGCACACTACTGCGATCGAAGGGAG AGTTCAGGGAGAAGAATCTAACGACAAG TAAGTC 
GACTCTAG 

ACGGATCTCCCgatcgttcaaacatttggcaataaagtttcttaagattgaatcctgttgcc 
ggtcttgcga 

tgattatcatataatttctgttgaattacgttaagcatgtaataattaacatgtaatgcatg 
acgttattta 

tgagatgggtttttatgattagagtcccgcaattatacatttaatacgcgatagaaaacaaa 
atatagcgcg 

caaactaggataaattatcgcgcgcggtgtcatctatgttactagatcGGAATTC 
EcoRI 



FIG. 7. 